Smartphone-based holographic measurement of polydisperse suspended particulate matter with various mass concentration ratios

Real-time monitoring of suspended particulate matter (PM) has become essential in daily life due to the adverse effects of long-term exposure to PMs on human health and ecosystems. However, conventional techniques for measuring micro-scale particulates commonly require expensive instruments. In this study, a smartphone-based device is developed for real-time monitoring of suspended PMs by integrating a smartphone-based digital holographic microscopy (S-DHM) and deep learning algorithms. The proposed S-DHM-based PM monitoring device is composed of affordable commercial optical components and a smartphone. Overall procedures including digital image processing, deep learning training, and correction process are optimized to minimize the prediction error and computational cost. The proposed device can rapidly measure the mass concentrations of coarse and fine PMs from holographic speckle patterns of suspended polydisperse PMs in water with measurement errors of 22.8 ± 18.1% and 13.5 ± 9.8%, respectively. With further advances in data acquisition and deep learning training, this study would contribute to the development of hand-held devices for monitoring polydisperse non-spherical pollutants suspended in various media.

Digital holographic microscopy (DHM) is an effective 3D imaging technique for volumetric measurement of various micro-scale particles, such as microorganisms 27 , microplastics 28 , and PMs 29,30 . Conventional DHM techniques use numerical reconstruction and autofocusing algorithms to extract 3D locational and morphological information of test particles from recorded 2D holographic images [31][32][33] . However, these techniques require advanced optical alignment, high computational cost, and large data storage. With the aid of the rapidly growing artificial intelligence (AI) technology 34 , DHM has been extensively combined with AI to overcome the technical limitations of conventional techniques to improve spatial resolution, reduce noises, and minimize computational cost [35][36][37][38] . For a cost-effective device for quantitative analysis of various particulates, smartphone-based DHM (S-DHM) combined with AI technology has been investigated for biomedical diagnosis and environmental monitoring [39][40][41] . In our previous study, a compact S-DHM-based device for high PM concentration measurement was developed based on holographic speckle patterns of PMs and deep learning 41 . Holographic speckle patterns vary depending on the optical characteristics and concentrations of test particles 42,43 . The mass concentrations of airborne PMs can be directly predicted by training deep learning algorithms with their holographic speckle images and the corresponding PM concentration labels. However, this technique requires further improvement in digital image processing and deep learning network to obtain accurate measurements of polydisperse PMs.
In this study, a real-time, non-invasive, and cost-effective device is proposed to measure the concentrations of highly concentrated polydisperse PMs suspended in water. This hand-held device is developed by integrating an S-DHM system and a deep learning network called Holo-SpeckleNet 41 . The developed S-DHM is used to acquire holographic speckle patterns of PMs consisting of various mass concentration ratios of coarse and fine PMs (PM c and PM f , respectively). The recorded holographic images are converted into trainable datasets by applying optimized digital image processing. The deep learning network, which consists of a deep autoencoder (DAE) and regression layers, is trained using the converted speckle images of monodisperse PM c and PM f samples and the corresponding ground truth concentration labels. The network trained with the monodisperse PM samples can selectively predict PM c and PM f concentrations from the speckle images of a polydisperse PM mixture without the time-consuming computational procedures required in conventional DHM techniques. The proposed S-DHM-based device can be effectively utilized to measure the absolute mass concentrations of polydisperse PMs with cost-effective optical components (~ $390) and a low computational cost for the training procedure.

Methods
Sample preparation. Arizona test dust (ISO 12103-1) is used to make the PM test samples. The average diameters of PM f and PM c are 1.256 ± 1.309 µm and 7.657 ± 1.286 µm, respectively. The size distributions of PMs are measured by using the Multisizer 3 particle counter (Beckman Coulter, USA). The polydispersity index (PDI) is defined as the square of the ratio of the standard deviation of the PM diameters to their mean diameter. The PDI values of PM f and PM c are 1.086 and 0.0282, respectively. Two reference samples of PM f and PM c with a mass concentration of 20 µg/ml are prepared by mixing PMs and distilled water. For each reference sample, 20 mg of PM is weighed by using an electronic balance (CAS, Korea) and then mixed with 1 L of distilled water in a 1 L wide neck bottle (Azlon, UK). The two reference samples and distilled water are then mixed in 30 ml square-shaped transparent bottles (Triforest, USA) at different volume ratios using 10 ml sterile syringes (Shinchang Medical, Korea) to prepare PM test samples with various concentration ratios of PM f and PM c .
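The PDI values and the reference concentration quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names, the 30 ml fill volume (`total_ml`), and the example target concentrations are illustrative assumptions, not values stated in the text.

```python
def polydispersity_index(mean_diameter_um, std_diameter_um):
    """PDI as the squared ratio of standard deviation to mean diameter."""
    return (std_diameter_um / mean_diameter_um) ** 2


def stock_concentration_ug_per_ml(mass_mg, volume_l):
    """Mass concentration of a reference suspension in µg/ml."""
    mass_ug = mass_mg * 1000.0
    volume_ml = volume_l * 1000.0
    return mass_ug / volume_ml


def mixing_volumes_ml(target_f, target_c, stock=20.0, total_ml=30.0):
    """Volumes (ml) of PM_f stock, PM_c stock, and water for one test sample.

    total_ml is a hypothetical fill volume chosen for illustration.
    """
    v_f = target_f / stock * total_ml
    v_c = target_c / stock * total_ml
    v_water = total_ml - v_f - v_c
    if v_water < 0:
        raise ValueError("target concentrations exceed what the stocks allow")
    return v_f, v_c, v_water


pdi_f = polydispersity_index(1.256, 1.309)   # PM_f diameters from the text
pdi_c = polydispersity_index(7.657, 1.286)   # PM_c diameters from the text
stock = stock_concentration_ug_per_ml(20.0, 1.0)
volumes = mixing_volumes_ml(target_f=4.0, target_c=2.0)
```

With the diameters reported above, the squared ratio reproduces the stated PDI values for both PM f and PM c, which confirms the direction of the ratio.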
S-DHM-based device for measuring polydisperse PM. Figure 1a shows a schematic of the proposed S-DHM system used for measuring suspended PM particles. A coherent beam is generated from a laser diode (λ = 532 nm, 20 mW) connected to a rechargeable lithium-ion battery (4.5 V). The laser beam is expanded into a quasi-uniform collimated beam with the aid of an aspheric lens (f = 3.1 mm) and a plano-convex lens (f = 40 mm). The holographic speckle pattern is magnified by using a 20× objective lens (NA = 0.4, f = 9 mm, Newport, USA). Holograms of PM particles are then recorded by using a Samsung Galaxy S10+ smartphone (Samsung, Korea). Consecutive holograms are recorded within 0.4 s by adopting the super-slow-motion mode (1280 × 720 pixels; 960 fps) of the smartphone, for which the magnified pixel size is 250 nm. The built-in back camera module of the smartphone is slightly modified by removing the auto-focusing lens to avoid spherical aberration. The aligned optical components are mounted on a 3D printed housing (242 mm × 110 mm × 70 mm) (Fig. 1b).
The test samples are the mixtures of PM c and PM f (Fig. 1c). PM c has particle diameters in the range from 5 to 10 μm, while PM f has particle diameters less than 3 μm. PM c and PM f are suspended in water contained in 30 ml bottles with mass concentrations in the range of 1-8 µg/ml to produce monodisperse PM c and PM f samples, respectively. These two types of PM particles are mixed in the bottles to prepare polydisperse PM samples with various concentration ratios. Because each test sample is placed close to the objective lens, the focal plane of the lens is located 5 mm away from the wall of the bottle within the volume of water. The prepared test samples are used to acquire holographic speckles of monodisperse and polydisperse PM particles. Figure 1d shows the overall procedure of the proposed S-DHM-based technique for measuring suspended PM particles. Holograms and the corresponding ground truth concentration labels of PM particles are acquired at first. The optimized digital image processing technique is then applied to convert the recorded holograms into trainable datasets. The converted images and the corresponding ground truth PM concentrations are classified in pairs into the training, validation, and test datasets. These datasets are used to train and optimize the artificial neural network composed of a DAE 44 and three regression layers. Finally, the PM c and PM f concentrations are evaluated from the acquired holograms of PM samples.
Raw holographic speckle images are converted into trainable datasets after applying the digital image processing (Fig. 1e). Holographic speckle patterns of test samples are recorded with the super-slow-motion mode of the smartphone (Fig. 1e i ) and then converted into holograms (Fig. 1e ii ; .tiff). Each RGB hologram is cropped and converted into grayscale (Fig. 1e iii ; .tiff; 700 × 700). In each hologram, background noise is commonly produced by unintended external vibrations and dust particles attached to optical components. A background image is obtained by ensemble averaging of consecutive holographic images. Then, the signal-to-noise ratio (SNR) of the holographic speckle patterns of test samples is enhanced by subtracting the spatially invariant background noise. The holographic speckle patterns of polydisperse PM samples are composed of speckle signals of PM c and PM f . The two speckle signals in a recorded hologram can be separated based on the difference in their spatial frequencies.
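The ensemble-averaging step described above can be sketched as follows. This is a minimal illustration using NumPy on synthetic data; the function name, array shapes, and noise levels are assumptions made for the example and do not come from the original processing code.

```python
import numpy as np


def subtract_background(frames):
    """Remove the time-invariant background from a stack of holograms.

    frames: array of shape (n_frames, height, width), grayscale.
    Returns the background image and the background-subtracted stack.
    """
    stack = np.asarray(frames, dtype=np.float64)
    background = stack.mean(axis=0)          # ensemble average over frames
    return background, stack - background    # broadcast over the frame axis


# Synthetic demo: a fixed dust pattern plus a zero-mean fluctuating signal.
rng = np.random.default_rng(0)
static = rng.uniform(0.0, 50.0, size=(32, 32))
signal = rng.normal(0.0, 5.0, size=(100, 32, 32))
background, cleaned = subtract_background(static + signal)
```

Because the average is taken over the same frames that are later corrected, any component that does not change between frames is removed, while the fluctuating speckle signal is retained.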
First, a 2D fast Fourier transform (2D FFT) is used to convert a hologram from the spatial domain to the frequency domain. A 2D Gaussian pass filter is then adopted to eliminate the high or low spatial frequency components in the frequency domain. The Gaussian low-pass filter (LPF) and high-pass filter (HPF) are defined as functions of the filter size R and the coordinates u and v in the frequency domain. The spatial frequency of PM f speckles is usually higher than that of PM c speckles. Therefore, PM f and PM c speckles can be selectively extracted from the original hologram by applying the HPF and LPF, respectively. Thereafter, an inverse 2D FFT is used to convert the filtered image from the frequency domain to the spatial domain (Fig. 1e iv,v ). The intensity shifting and contrast enhancement methods are then applied to emphasize the representative features of holographic speckle patterns. Each hologram is then segmented into 10 × 10 pieces of 70 × 70 pixels each to prevent data dissipation of small speckle signals by enlarging their relative sizes compared with the segmented image sizes (Fig. 1e vi ). Without image segmentation, the information of small speckle signals recorded in raw holograms is easily dissipated during data compression in the deep learning training process. Therefore, the relative sizes of speckle signals should be sufficiently enlarged to obtain high prediction accuracy (Fig. S1). The processed images and the corresponding ground truth PM concentration information are classified into the training, validation, and test datasets. For each class, the training, validation, and test datasets consist of 10,000, 20,000, and 30,000 segmented images generated from 100, 200, and 300 sequential holograms, respectively.
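The filtering and segmentation steps can be sketched with NumPy as shown below. The sketch assumes the common Gaussian form exp(-(u² + v²)/(2R²)) for the LPF and its complement for the HPF; the exact normalization used in this study is not recoverable from the text, so this form, along with all function names, should be read as an assumption.

```python
import numpy as np


def gaussian_lpf(shape, R):
    """Centered Gaussian low-pass mask; exp(-(u^2+v^2)/(2R^2)) is assumed."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    uu, vv = np.meshgrid(u, v, indexing="ij")
    return np.exp(-(uu**2 + vv**2) / (2.0 * R**2))


def filter_hologram(image, R, mode="low"):
    """Apply a Gaussian LPF or HPF in the frequency domain."""
    mask = gaussian_lpf(image.shape, R)
    if mode == "high":
        mask = 1.0 - mask                      # HPF taken as complement of LPF
    spectrum = np.fft.fftshift(np.fft.fft2(image))   # zero frequency at center
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


def segment(image, n=10):
    """Split an image into n x n equal tiles, returned in row-major order."""
    h, w = image.shape
    th, tw = h // n, w // n
    tiles = image[: th * n, : tw * n].reshape(n, th, n, tw).swapaxes(1, 2)
    return tiles.reshape(n * n, th, tw)


# Synthetic 700 x 700 hologram standing in for a cropped grayscale frame.
rng = np.random.default_rng(1)
hologram = rng.uniform(0.0, 1.0, size=(700, 700))
low = filter_hologram(hologram, R=20, mode="low")
high = filter_hologram(hologram, R=20, mode="high")
tiles = segment(high, n=10)
```

With complementary masks, the low-pass and high-pass outputs sum back to the original hologram, and a 700 × 700 image yields 100 tiles of 70 × 70 pixels, matching the input size of 4900 neurons used later for the DAE.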
Calculation of optical characteristics of holographic speckle patterns. Optical characteristics of six datasets, namely the high-pass filtered and low-pass filtered speckle images of monodisperse PM f , monodisperse PM c , and PM mixture, are investigated. First, the intensity gradient is given by the variations in the pixel intensity (I) between adjacent pixels 45 . The mean intensity gradient of the speckle pattern recorded in an image is calculated by averaging the magnitude of the intensity gradient over all pixels, where x and y represent the discrete image coordinates of an image with M × N pixels. The pixel intensity is distributed from zero (black) to one (white) in an image. Second, the speckle size is the average pixel area occupied by the speckle pattern in the image 46 . The normalized intensity distributions of low-pass filtered and high-pass filtered holograms of monodisperse PM f and PM c are shown in Fig. S2. The average and standard deviation of pixel intensities contained in each case are calculated. The intensity threshold level is defined as the sum of the average and standard deviation of pixel intensities. The holographic images are then binarized using the intensity thresholding method. The "nnz" and "bwconncomp" functions of the MATLAB software are then utilized to extract the total pixel area and the total number of speckles in each binarized image, respectively. Thereafter, the mean speckle size is obtained by dividing the total pixel area by the total number of speckles. Third, the speckle width represents the mean thickness of speckle patterns calculated with the normalized autocovariance function from the intensity distribution in the image 47 . The normalized autocovariance function f(x,y) is computed from the intensity distribution by using the fast Fourier transform (FFT) and the inverse FFT (FFT −1 ). The speckle width is evaluated from the full width at half maximum (FWHM) of the normalized f(x,y). Thereafter, the horizontal and vertical widths are averaged to obtain the mean speckle width contained in the image.
Fourth, the 2D spatial frequency distributions of speckle patterns are acquired by applying the FFT function on the speckle images 33 . The power spectral density profiles of horizontal and vertical spatial frequency components are obtained from the frequency domain of the transformed array. The mean spatial frequency is then calculated by averaging the mean horizontal and vertical spatial frequencies.
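The first three speckle metrics can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the MATLAB code used in the study: the gradient uses central differences (`np.gradient`), the component counter uses 8-connectivity as a stand-in for `bwconncomp`, the autocovariance follows the common form (FFT⁻¹[|FFT(I)|²] − ⟨I⟩²)/(⟨I²⟩ − ⟨I⟩²), and the FWHM is measured in whole pixels without interpolation.

```python
import numpy as np


def mean_intensity_gradient(image):
    """Average gradient magnitude over all M x N pixels."""
    gy, gx = np.gradient(np.asarray(image, dtype=np.float64))
    return np.sqrt(gx**2 + gy**2).mean()


def count_components(mask):
    """Count 8-connected regions of True pixels with a flood fill."""
    mask = np.array(mask, dtype=bool)          # work on a copy
    rows, cols = mask.shape
    count = 0
    for r0, c0 in zip(*np.nonzero(mask)):
        if not mask[r0, c0]:
            continue                           # already absorbed into a region
        count += 1
        mask[r0, c0] = False
        stack = [(r0, c0)]
        while stack:
            r, c = stack.pop()
            for rr in range(max(r - 1, 0), min(r + 2, rows)):
                for cc in range(max(c - 1, 0), min(c + 2, cols)):
                    if mask[rr, cc]:
                        mask[rr, cc] = False
                        stack.append((rr, cc))
    return count


def mean_speckle_size(image):
    """Mean pixel area per speckle after thresholding at mean + std."""
    img = np.asarray(image, dtype=np.float64)
    binary = img > img.mean() + img.std()
    n_speckles = count_components(binary)
    return np.count_nonzero(binary) / n_speckles if n_speckles else 0.0


def normalized_autocovariance(image):
    """Circular autocovariance via FFT, normalized to 1 at zero lag."""
    img = np.asarray(image, dtype=np.float64)
    power = np.abs(np.fft.fft2(img)) ** 2
    autocorr = np.fft.ifft2(power).real / img.size
    mean_sq = img.mean() ** 2
    cov = (autocorr - mean_sq) / ((img**2).mean() - mean_sq)
    return np.fft.fftshift(cov)                # zero lag moved to the center


def fwhm_pixels(profile):
    """Width of the central peak at half maximum, in whole pixels."""
    center = len(profile) // 2
    left = right = center
    while left > 0 and profile[left - 1] >= 0.5:
        left -= 1
    while right < len(profile) - 1 and profile[right + 1] >= 0.5:
        right += 1
    return right - left + 1


def mean_speckle_width(image):
    """Average of the horizontal and vertical FWHM of the autocovariance."""
    cov = normalized_autocovariance(image)
    cy, cx = cov.shape[0] // 2, cov.shape[1] // 2
    return 0.5 * (fwhm_pixels(cov[cy, :]) + fwhm_pixels(cov[:, cx]))


# Synthetic test images (not measured data).
ramp = np.tile(2.0 * np.arange(16), (16, 1))
two_squares = np.zeros((20, 20))
two_squares[2:5, 2:5] = 1.0
two_squares[10:13, 12:15] = 1.0
rng = np.random.default_rng(3)
noise = rng.normal(size=(32, 32))
blocky = np.kron(noise, np.ones((4, 4)))       # same noise with 4 x 4 grains
```

On the synthetic images, enlarging the grain from one pixel to 4 × 4 pixels widens the autocovariance peak, which is the behavior the speckle width is meant to capture.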
Hyperparameters of deep learning algorithms. Due to the limited performance of an embedded smartphone central processing unit (CPU), the number of neurons in the DAE is optimized to maximize PM measurement accuracy and minimize computational cost. As the number of neurons increases, more detailed spatial features of the speckle pattern are extracted from the input images, but the computational cost also increases considerably. The optimum size of the DAE is investigated to effectively extract the common traits of speckle patterns with varying sizes and concentrations of PMs. As a result, the DAE structure is composed of an encoder (4900 × 512 × 256 × 128 × 64 neurons) and a decoder (64 × 128 × 256 × 512 × 4900 neurons). The Adam optimizer and sigmoid transfer function are used to minimize the root mean square error (RMSE) between the input and reconstructed images. The RMSE is calculated by averaging the squared pixelwise residuals between the two images and taking the square root of the mean. The batch size, epochs, and learning rate of the DAE are set to 4096, 5000, and 10 −3 , respectively. The regression layers (64 × 256 × 256 × 1 neurons) are trained with the main features extracted from the latent space of the DAE and the corresponding PM concentration labels. A gradient descent optimizer and the rectified linear unit activation function are utilized to minimize the mean absolute error (MAE) between the PM prediction values and PM concentration labels. The MAE is calculated by averaging the absolute errors between the predicted and ground truth values. The batch size, epochs, and learning rate of the regression layers are set to 8192, 10,000, and 10 −7 , respectively. The weights and biases in the deep learning algorithms are iteratively updated to find the global minima of the loss functions based on a stochastic gradient descent method. The contrast of speckle pattern images is enhanced using the Python imaging library.
MATLAB R2021a software is utilized to quantitatively analyze various optical characteristics of speckle patterns. Statistical data analyzed by using ANOVA are expressed as the mean ± standard deviation.
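To give a sense of the model size implied by the layer widths listed above, the number of trainable parameters can be counted directly. The sketch below assumes fully connected layers with one bias per neuron, which is consistent with the description in terms of neurons, weights, and biases but is not stated explicitly; the `rmse` and `mae` helpers follow the loss definitions given in the text.

```python
import math

ENCODER = [4900, 512, 256, 128, 64]
DECODER = [64, 128, 256, 512, 4900]
REGRESSION = [64, 256, 256, 1]


def dense_parameter_count(layer_sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def rmse(predicted, target):
    """Square root of the mean squared residual."""
    residuals = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(residuals) / len(residuals))


def mae(predicted, target):
    """Mean of the absolute errors."""
    errors = [abs(p - t) for p, t in zip(predicted, target)]
    return sum(errors) / len(errors)


counts = {
    name: dense_parameter_count(sizes)
    for name, sizes in [("encoder", ENCODER),
                        ("decoder", DECODER),
                        ("regression", REGRESSION)]
}
```

Under this assumption the first encoder layer and the last decoder layer, both tied to the 4900-pixel input, account for most of the parameters, which is one reason the 70 × 70 segment size matters for the computational cost.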

Results
Optical characteristics of holographic speckle patterns. A collimated incident laser beam is scattered by a test particle and the scattered beam interferes with a reference beam. The interference fringe pattern varies depending on the physical properties of the test particle, such as its size, shape, refractive index, and distance from the focal plane. In a turbid medium with highly concentrated suspended particles, numerous interference fringe patterns are generated. Multiple scattering phenomena occur owing to the presence of adjacent particles and holographic speckle patterns are formed 43 . The holographic speckle pattern largely depends on the physical properties of test particles suspended in a volume. Figure 2a shows typical holographic speckle patterns according to the ground truth concentration values of PM f (ρ f ) and PM c (ρ c ). Various optical characteristics of PM speckles according to their concentration are quantitatively analyzed, including mean intensity gradient 45 , speckle size 46 , speckle width 47 , and spatial frequency 33 (Fig. 2b-e, Tables S1-4). The x-axes of monodisperse PM f (PM f _HPF, PM f _LPF) and high-pass filtered PM mixture (PM mixture_HPF) speckles stand for PM f concentration, while those of monodisperse PM c (PM c _HPF, PM c _LPF) and low-pass filtered PM mixture (PM mixture_LPF) speckles represent PM c concentration. The filter size R is set to 20. The intensity gradient and size of speckles tend to increase with PM concentration. On the other hand, the intensity gradient of high-pass filtered speckles of monodisperse PM c shows no relationship with PM concentration. The speckle width of the high-pass filtered datasets exhibits a positive correlation with the corresponding PM concentration, although with large deviations. Among the low-pass filtered speckles, only the monodisperse PM c dataset shows a positive relationship between its speckle width and PM concentration. The speckles filtered with HPF and LPF largely differ in their spatial frequencies.
Prediction of PM f concentrations. The 41 . Figure 4a shows the flow chart of PM f concentration measurement. A grayscale speckle image of monodisperse PM f is filtered with the HPF. The intensity distribution of the high-pass filtered image is biased toward black (pixel intensity = 0). Therefore, the intensity distribution is shifted toward white (pixel intensity = 255) by an intensity shifting parameter S to enhance the deep learning training performance. Contrast enhancement and image segmentation are then applied to acquire trainable datasets. The deep learning network is trained with the processed PM f speckles and the corresponding labels. PM f concentration labels of 1, 3, 6, and 8 µg/ml are used as the training datasets. After training, the identical digital image processing is applied to speckle images of PM mixture to predict their PM concentrations.
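The intensity shifting and contrast enhancement steps can be illustrated with a short NumPy sketch. The exact operations are not specified in the text, so two assumptions are made here: the shift adds a constant S to 8-bit intensities and clips the result, and the contrast step scales deviations from the mean gray level by a factor C, which mirrors how a typical contrast enhancer behaves. The function names and the synthetic tile are illustrative only.

```python
import numpy as np


def shift_intensity(image, S):
    """Add a constant offset S to 8-bit intensities and clip to [0, 255]."""
    return np.clip(np.asarray(image, dtype=np.float64) + S, 0.0, 255.0)


def enhance_contrast(image, C):
    """Scale deviations from the mean gray level by C and clip to [0, 255]."""
    img = np.asarray(image, dtype=np.float64)
    return np.clip(img.mean() + C * (img - img.mean()), 0.0, 255.0)


# Synthetic dark tile standing in for a high-pass filtered segment.
rng = np.random.default_rng(2)
dark = np.clip(rng.normal(5.0, 3.0, size=(70, 70)), 0.0, 255.0)
processed = enhance_contrast(shift_intensity(dark, S=50), C=7)
```

Shifting before stretching matters in this sketch: a tile whose values sit near zero would otherwise lose its negative deviations to clipping as soon as the contrast is amplified.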
Hyperparameters related to the Gaussian filtering (R), intensity shifting (S), and contrast enhancement (C) are optimized to minimize the RMSE of test datasets (Fig. 4b-d). The optimum value of one hyperparameter is found by iteratively changing it while the other hyperparameters are fixed at their optimum values. The optimum S and C values are found to be 50 and 7, respectively. For the Gaussian filtering, the optimum R values of monodisperse PM f and PM mixture are 15 and 20, respectively. The Gaussian filter size R is set to 20, because the main objective of this study is to measure the concentrations of polydisperse PMs. The mean PM concentration value for one input image is obtained by averaging the predicted PM concentration values of the 100 segments of the input image. For each class, the mean PM concentrations of sequential images are averaged again to acquire the final predicted value of the class. As a result, the trained model can predict PM f concentrations with a measurement error of 14.1 ± 16.7% (Fig. 4e, Table S5). The PM f concentrations of PM mixture are also predicted with a measurement error of 13.5 ± 9.8% (Fig. 4f, Table S6). Since the PDI of PM f samples is higher than that of PM c , additional measurement errors might be induced by the variation in the size of PM f suspended in the monodisperse PM f and PM mixture samples. With the aid of HPF, the RMSE of the predicted PM f concentrations decreases by approximately 55.7%. These results imply that the speckle signal of PM f can be effectively extracted from that of PM mixture using HPF. Figure 5a shows the flow chart of PM c concentration measurement.
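The two-stage averaging used to obtain the final PM f value of a class can be written compactly. In the sketch below the segment predictions are made-up numbers, only four segments per hologram are shown instead of 100, and the relative-error formula is an assumed definition of the measurement error, which the text does not spell out.

```python
from statistics import mean


def image_concentration(segment_predictions):
    """Average the predictions of the segments cut from one hologram."""
    return mean(segment_predictions)


def class_concentration(per_image_predictions):
    """Average the per-image means of all sequential holograms in a class."""
    return mean(image_concentration(p) for p in per_image_predictions)


def percent_error(predicted, label):
    """Absolute relative error in percent (an assumed definition)."""
    return abs(predicted - label) / label * 100.0


# Hypothetical predictions: 3 holograms x 4 segments for a 4 µg/ml label.
predictions = [
    [3.6, 4.4, 4.1, 3.9],
    [4.2, 4.6, 3.8, 4.2],
    [3.8, 4.1, 4.0, 3.7],
]
final = class_concentration(predictions)
error = percent_error(final, label=4.0)
```

Averaging over segments first and over holograms second gives each hologram equal weight, so a single frame with unusual speckle content cannot dominate the class value.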

Prediction of PM c concentrations.
The shifting parameter S is set to zero, because the mean pixel intensity of a low-pass filtered image is already located approximately midway between black and white (pixel intensity = 128). After applying contrast enhancement and image segmentation procedures, the deep learning network is trained with the processed PM c speckles and their corresponding labels. PM c concentration labels of 1, 3, and 8 µg/ml are used as the training datasets. For monodisperse PM c , the optimum C values of the validation and test datasets are 5 and 3, respectively (Fig. 5b). The total RMSE of the two datasets is minimized at C = 3, while the prediction accuracy in the concentration range of 1-2 µg/ml is better at C = 5 (Fig. S3). Therefore, the optimum C is set to 5. The optimum R values of the validation and test datasets are 20 and 15, respectively (Fig. 5c). The optimum R is set to 20, which provides a high prediction accuracy in the high concentration range of 6-8 µg/ml (Fig. S4). As a result, the trained model predicts PM c concentrations with a measurement error of 14.3 ± 15.4% (Fig. 5f, Table S7). However, without the correction process, the model fails to predict PM c concentrations of PM mixture, as shown in Fig. 5g with gray rhombus symbols (Table S8). Incorrect predictions of PM c concentrations might occur because the low-pass filtered images of PM c and PM f look similar (Fig. 2). When low-pass filtered images of monodisperse PM f are fed to the model trained with monodisperse PM c , the trained model incorrectly predicts PM f concentrations as PM c concentrations (Y) (Fig. 5d). The linear regression line of the mispredicted PM c concentrations can be expressed as Y = AX + B, where X indicates the PM f concentration. This result implies that LPF cannot fully filter PM f signals from speckle images of PM mixture.
Therefore, the PM c concentrations predicted from low-pass filtered images of PM mixture (P) need to be corrected by subtracting the mispredicted PM c concentration Y. In this study, the two speckle signals of PM f and PM c acquired from PM mixture are assumed to overlap and form a superposition signal. When PM f and PM c coexist in a 3D volume, multiple scattering phenomena occur successively owing to the presence of adjacent particles. However, an analytical estimation of the optical properties of the superposition signal is difficult to obtain due to the unpredictable volume scattering phenomena induced by polydisperse non-spherical PMs 43 . Therefore, Y is multiplied by a correction coefficient (k) to account for the nonlinear superposition of the two speckle signals of PM f and PM c . The appropriate correction coefficient is estimated by searching for the minimum RMSE of the corrected PM c concentrations (P c = P − kY) (Fig. 5e). Although the global minimum RMSE occurs at R = 15, the correction requires the PM f concentrations (X) to be accurately measured in advance. Therefore, the optimum k is set to 0.57 at R = 20. After the correction process, the performance of PM c concentration prediction improves with a measurement error of 22.8 ± 18.1%, as illustrated in Fig. 5g with blue triangle symbols (Table S8).
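The correction step P c = P − kY with Y = AX + B, and the search for the coefficient k, can be sketched as a simple grid search. The regression coefficients A and B are not reported in the text, so the values below, the sample concentrations, and the candidate grid are all hypothetical; the synthetic data are constructed so that the search should recover k = 0.6.

```python
import math


def rmse(predicted, target):
    """Square root of the mean squared residual."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target))
                     / len(target))


def mispredicted_pm_c(x_f, A, B):
    """Y = A*X + B: the PM_c reading caused by PM_f alone."""
    return A * x_f + B


def corrected_pm_c(P, x_f, k, A, B):
    """P_c = P - k*Y for one mixture sample."""
    return P - k * mispredicted_pm_c(x_f, A, B)


def best_k(P_values, x_f_values, labels_c, A, B, candidates):
    """Grid search for the k that minimizes the RMSE of corrected values."""
    def score(k):
        corrected = [corrected_pm_c(P, x, k, A, B)
                     for P, x in zip(P_values, x_f_values)]
        return rmse(corrected, labels_c)
    return min(candidates, key=score)


# Hypothetical numbers: A, B, and the samples are made up for illustration.
A, B = 0.8, 0.5
x_f = [1.0, 3.0, 5.0, 7.0]          # PM_f concentrations from the HPF model
true_c = [6.0, 4.0, 3.0, 1.0]       # ground truth PM_c labels
raw_P = [c + 0.6 * mispredicted_pm_c(x, A, B) for c, x in zip(true_c, x_f)]
grid = [i / 100 for i in range(0, 101)]
k_opt = best_k(raw_P, x_f, true_c, A, B, grid)
```

The sketch makes the dependency explicit: the correction can only be as good as the PM f estimate X supplied to it, which is why the filter size that favors PM f accuracy is retained.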

Discussion
Compared with conventional measurement techniques for particles, the proposed S-DHM device has a compact, cost-effective optical arrangement composed of a 3D printed housing, an inexpensive laser diode, and commercial optical lenses. The proposed S-DHM device overcomes the concentration limit of commercially available hand-held particle counters used for PM monitoring (Table S9). The predictable range of PM size is limited by the spatial resolution (Δ = 0.61λ/NA = 811.3 nm) and the field-of-view (175 × 175 µm²) of the proposed S-DHM device. Since the magnified pixel length of the S-DHM device is 250 nm, the image sensor of the smartphone can record PM signals down to the spatial resolution. The spatial resolution can be further enhanced by increasing the numerical aperture (NA) of the objective lens and decreasing the wavelength of the laser diode. By reducing the magnification of the objective lens and using an up-to-date smartphone whose image sensor has a larger number of pixels, the field-of-view of the S-DHM device can be further expanded. However, this kind of hardware upgrade will increase the total cost of the optical arrangement.
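The resolution and field-of-view figures quoted above follow directly from the optical parameters given in the Methods. A minimal check, with illustrative function names, is:

```python
def rayleigh_resolution_nm(wavelength_nm, numerical_aperture):
    """Spatial resolution 0.61 * lambda / NA, in nanometers."""
    return 0.61 * wavelength_nm / numerical_aperture


def field_of_view_um(pixels, pixel_size_nm):
    """Side length of the imaged area in micrometers."""
    return pixels * pixel_size_nm / 1000.0


resolution = rayleigh_resolution_nm(532.0, 0.4)    # 532 nm laser, NA = 0.4
fov = field_of_view_um(700, 250.0)                 # 700 px at 250 nm per px
pixels_per_resolution = resolution / 250.0         # sampling of one resolved spot
```

One resolution element spans a little more than three magnified pixels, so the sensor sampling is finer than the optical resolution and the objective lens, not the pixel size, sets the smallest resolvable feature.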
Recently, the specifications of hand-held smartphones have been remarkably improved. For example, the resolution of a smartphone image sensor exceeds 10 megapixels. The performance of a CPU embedded in a smartphone is sufficient to carry out digital image processing and operate pretrained deep learning models using TensorFlow Lite. This implies that a smartphone can be utilized to replace the expensive camera of a conventional DHM system. However, the prediction accuracy of the S-DHM system still has technical limitations. Given that the exposure time of a smartphone camera is over 42 µs, the power of a laser diode cannot exceed 20 mW to avoid pixel saturation. The weak-powered laser beam generates holographic speckle patterns with a low SNR. The random motions of suspended PMs cause continuous fluctuations in the intensity of the background noise. Therefore, sufficient removal of the background noise using an ensemble averaging method becomes difficult, and the remaining background noise contained in all datasets with low SNRs can reduce the prediction accuracy. As a result, the predicted PM concentrations in the range of 1-2 µg/ml are found to be higher than the target values, whereas those in the range of 7-8 µg/ml are lower (Figs. 4e, 5f). The accuracy of the proposed S-DHM device can be enhanced in the near future with further improvements in the hardware of the smartphone camera and laser power. Although the proposed device can measure PM f concentrations of both monodisperse PM f and PM mixture samples with a moderate accuracy, the measurement of PM c concentration remains somewhat limited. Given the overlapping spatial frequency distributions of low-pass filtered PM f and PM c speckles, it is hard to fully separate the two speckles using a 2D Gaussian filter. To filter PM f speckles with lower spatial frequency, the filter size should be increased.
However, since some PM c speckles are lost together with them, the measurement error increases instead (Fig. 5c,e). Therefore, an additional correction process is required. Due to the complex volume scattering phenomena of PMs, it is difficult to verify that the correction coefficient (k = 0.57) is theoretically acceptable. For example, the total RMSE of the measured PM c concentrations is largely reduced by the adoption of the correction process, while the measurement errors of several datasets with high PM c concentration labels increase instead (Table S8). This implies that the correction coefficient (k) may have a nonlinear relationship with the PM concentration ratios. For further improvement of the correction process, the correlation between the correction coefficient (k) and PM concentrations should be investigated in detail. To increase the prediction accuracy of PM c concentration without the correction process, one option is to generate input datasets of all combinations of polydisperse PM mixtures and train the deep learning algorithms with them. However, this approach might require cumbersome sample preparation and high computational cost compared with the proposed method.

Conclusions
In summary, a hand-held smartphone-based device for cost-effective monitoring of suspended PMs in water is developed by integrating an S-DHM system and a deep learning network. The developed device can selectively measure the mass concentrations of PM f and PM c from a holographic speckle image of a polydisperse PM mixture. The predictable particle size and concentration range of PMs can be extended by training the deep learning algorithms with additional datasets. The proposed PM monitoring technique can be applied to various micro-scale pollutants suspended in diverse media, such as microplastics and airborne PMs. This study would contribute to the development of compact hand-held devices for real-time monitoring of environmental quality with a portable smartphone.

Data availability
All data used to evaluate the results in the paper are present in the main text and/or the Supplementary Information. Additional data related to this paper may be available from the authors.